---
title: Deploy and monitor DataRobot models in Azure Kubernetes Service
description: Deploy and monitor DataRobot models in Azure Kubernetes Service

---

# Deploy and monitor DataRobot models in Azure Kubernetes Service {: #deploy-and-monitor-datarobot-models-in-azure-kubernetes-service }

!!! info "Availability information"
    The MLOps model package export feature is off by default. Contact your DataRobot representative or administrator for information on enabling this feature for DataRobot MLOps.

    **Feature flag**: Enable MMM model package export


This page shows how to deploy machine learning models on Azure Kubernetes Service (AKS) to create production scoring pipelines with DataRobot's MLOps Portable Prediction Server (PPS).

![](images/int_damdmia_large_015.png)

DataRobot Automated Machine Learning provides a dedicated prediction server as a low-latency, synchronous REST API suitable for real-time predictions. The DataRobot MLOps PPS extends this functionality to serve ML models in container images, giving you portability and control over your ML model deployment architecture.

A containerized PPS is well suited to deployment in a Kubernetes cluster, where it can take advantage of Kubernetes auto-scaling and high availability. The combination of PPS and Kubernetes is ideal for volatile, irregular workloads such as those found in IoT use cases.

## Create a model {: #create-a-model }

The examples on this page use the [public LendingClub dataset](https://s3.amazonaws.com/datarobot_public_datasets/10K_Lending_Club_Loans.csv) to predict the loan amount for each application.

1. To upload the training data to DataRobot, do either of the following:

    * Click **Local File**, and then select the LendingClub dataset CSV file from your local file system.

    * Click **URL** to open the **Import URL** dialog box and copy the LendingClub dataset URL above:

        ![](images/int_htmsmwdm_large_008.png)

        In the **Import URL** dialog box, paste the LendingClub dataset **URL** and click **Import from URL**:

        ![](images/int_htmsmwdm_large_006.png)

2. Enter `loan_amt` as your target, the value you want to predict (1), and click **Start** (2) to run Autopilot.

    ![](images/int_htmsmwdm_large_009.png)

3. After Autopilot finishes, click **Models** and select a model at the top of the Leaderboard.

4. Under the model you selected, click **Predict > Deploy** to access the **Get model package** download.

5. Click **Download .mlpkg** to start the `.mlpkg` file download.

!!! note
    For more information, see the documentation on the [Portable Prediction Server](portable-pps).

## Create an image with the model package {: #create-an-image-with-the-model-package }

After you [obtain the PPS base image](portable-pps#obtain-the-pps-docker-image), create a new version of it by creating a Dockerfile with the content below:

``` dockerfile
FROM datarobot/datarobot-portable-prediction-api:<TAG> AS build

COPY . /opt/ml/model
```

!!! note
    For more information on how to structure a Dockerfile, see the [Dockerfile reference](https://docs.docker.com/engine/reference/builder/) in the Docker documentation.

For the `COPY` command to work, you must have the `.mlpkg` file in the same directory as the Dockerfile. After creating your Dockerfile, run the command below to create a new image that includes the model:

``` bash
docker build . --tag regressionmodel:latest
```

## Create an Azure Container Registry {: #create-an-azure-container-registry }

Before deploying your image to AKS, push it to a container registry such as Azure Container Registry (ACR):

1. In the Azure Portal, click **Create a resource > Containers**, then click **Container Registry**.

2. On the **Create container registry** blade, enter the following:

    | Field | Description |
    |-------|-------------|
    | **Registry name**| Enter a suitable name |
    | **Subscription**| Select your subscription |
    | **Resource group**| Use your existing resource group |
    | **Location**| Select your region |
    | **Admin user**| Enable |
    | **SKU**| Standard |

3. Click **Create**.

4. Navigate to your newly created registry and select **Access Keys**.

5. Copy the admin password. You use it to authenticate Docker with the registry before pushing the image that contains the `.mlpkg` file.

![](images/int_damdmia_large_006.png)
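
If you prefer the command line, the portal steps above can probably be reproduced with the Azure CLI. The following is a minimal sketch under stated assumptions, not part of the original walkthrough: the helper names `create_registry` and `show_admin_password` are hypothetical, as are the example registry and resource group values.

``` bash
# Sketch: Azure CLI equivalent of the portal steps above.
# create_registry and show_admin_password are hypothetical helper names.

# Create a Standard-SKU registry with the admin user enabled.
create_registry() {
  local registry="$1" resource_group="$2"
  az acr create \
    --resource-group "$resource_group" \
    --name "$registry" \
    --sku Standard \
    --admin-enabled true
}

# Print the first admin password (the value listed under Access Keys).
show_admin_password() {
  local registry="$1"
  az acr credential show \
    --name "$registry" \
    --query "passwords[0].value" \
    --output tsv
}

# Example usage (placeholder values):
# create_registry myregistry my-resource-group
# show_admin_password myregistry
```

Wrapping the calls in functions keeps the placeholders in one place; nothing runs until you call a function with your own values.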


## Push the model image to ACR {: #push-the-model-image-to-acr }

To push your new image to Azure Container Registry (ACR), log in with the following command. Replace `<DOCKER_USERNAME>` with the registry name you chose earlier; with the admin user enabled, the registry name is also the login username:

``` bash
docker login <DOCKER_USERNAME>.azurecr.io
```

When prompted, enter the admin password you copied from the registry's **Access Keys** page.

Once logged in, tag your Docker image with the registry's login server name and push it with the following commands (again replacing `<DOCKER_USERNAME>` with your registry name):

``` bash
docker tag regressionmodel <DOCKER_USERNAME>.azurecr.io/regressionmodel
docker push <DOCKER_USERNAME>.azurecr.io/regressionmodel
```
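
The tag and push steps can be sketched as a pair of small shell helpers. This is an illustrative sketch only: `image_ref` and `tag_and_push` are hypothetical names, and the tag defaults to `latest` on the assumption that you built the image as `regressionmodel:latest`.

``` bash
# image_ref builds "<registry>.azurecr.io/<image>:<tag>"; the tag defaults to "latest".
image_ref() {
  local registry="$1" image="$2" tag="${3:-latest}"
  printf '%s.azurecr.io/%s:%s' "$registry" "$image" "$tag"
}

# tag_and_push tags a local image with its full ACR name, then pushes it.
tag_and_push() {
  local registry="$1" image="$2" tag="${3:-latest}"
  local ref
  ref="$(image_ref "$registry" "$image" "$tag")"
  docker tag "$image:$tag" "$ref" && docker push "$ref"
}

# Example usage (placeholder registry name):
# tag_and_push myregistry regressionmodel
```

Spelling out the tag avoids relying on Docker's implicit `latest` default, which makes it easier to see exactly which image version ends up in the registry.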

## Create an AKS cluster {: #create-an-aks-cluster }

This section assumes you are familiar with AKS and Azure's Command Line Interface (CLI).

!!! note
    For more information on AKS, see [Microsoft's Quickstart tutorial](https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough){ target=_blank }.

1. If you don't have a running AKS cluster, create one:

    ``` bash
    RESOURCE_GROUP=ai_success_eng
    CLUSTER_NAME=AIEngineeringDemo

    az aks create \
    --resource-group $RESOURCE_GROUP \
    --name $CLUSTER_NAME \
    -s Standard_B2s \
    --node-count 1 \
    --generate-ssh-keys \
    --service-principal XXXXXX \
    --client-secret XXXX \
    --enable-cluster-autoscaler \
    --min-count 1 \
    --max-count 2
    ```

2. Create a Docker registry secret so that AKS can pull images from the private registry. In the command below, replace the following placeholders with your actual values:

    * `<SECRET_NAME>`

    * `<YOUR_REPOSITORY_NAME>`

    * `<DOCKER_USERNAME>`

    * `<YOUR_SECRET_ADMIN_PW>`

    ``` bash
    kubectl create secret docker-registry <SECRET_NAME> --docker-server=<YOUR_REPOSITORY_NAME>.azurecr.io --docker-username=<DOCKER_USERNAME> --docker-password=<YOUR_SECRET_ADMIN_PW>
    ```

3. Deploy your Portable Prediction Server image. There are many ways to deploy applications, but the easiest method is via the Kubernetes dashboard. Start the Kubernetes dashboard with the following command:

    ``` bash
    az aks browse --resource-group $RESOURCE_GROUP --name $CLUSTER_NAME
    ```
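
The `kubectl` commands on this page assume that `kubectl` is already configured for your cluster. If it is not, a sketch like the following, run before you create the registry secret, merges the cluster credentials into your local kubeconfig and lists the nodes as a quick connectivity check. The helper name `connect_kubectl` is hypothetical.

``` bash
# Sketch: point kubectl at the AKS cluster, then confirm the nodes are visible.
# connect_kubectl is a hypothetical helper name.
connect_kubectl() {
  local resource_group="$1" cluster="$2"
  az aks get-credentials --resource-group "$resource_group" --name "$cluster" \
    && kubectl get nodes
}

# Example usage, reusing the variables from the cluster creation step:
# connect_kubectl "$RESOURCE_GROUP" "$CLUSTER_NAME"
```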

## Deploy a model to AKS {: #deploy-a-model-to-aks }

To install and deploy the PPS containing your model:

1. Click **CREATE > CREATE AN APP**.
2. On the **CREATE AN APP** page, specify the following:

    ![](images/int_damdmia_large_007.png)

    | Field | Value |
    |-------|-------|
    | **App name** | e.g., `portablepredictionserver` |
    | **Container image** | e.g., `aisuccesseng.azurecr.io/regressionmodel:latest` |
    | **Number of pods** | e.g., `1` |
    | **Service** | `External` |
    | **Port**, **Target port**, and **Protocol** | `8080`, `8080`, and `TCP` |
    | **Image pull secret** | previously created |
    | **CPU requirement (cores)** | `1` |
    | **Memory requirement (MiB)** | `250` |

3. Click **Deploy**&mdash;it may take several minutes to deploy.
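
If you would rather not use the dashboard, a roughly equivalent deployment can be described as a manifest and applied with `kubectl`. The sketch below is an assumption about how the form fields in the table map to Kubernetes objects (an external service is rendered as type `LoadBalancer`); `render_manifest` is a hypothetical helper name, and the image and secret names are placeholders.

``` bash
# Sketch: render a Deployment and an external Service that mirror the form values above.
# render_manifest is a hypothetical helper; arguments are app name, image, and pull secret.
render_manifest() {
  local app="$1" image="$2" pull_secret="$3"
  cat <<EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ${app}
spec:
  replicas: 1
  selector:
    matchLabels:
      app: ${app}
  template:
    metadata:
      labels:
        app: ${app}
    spec:
      containers:
      - name: ${app}
        image: ${image}
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: "1"
            memory: 250Mi
      imagePullSecrets:
      - name: ${pull_secret}
---
apiVersion: v1
kind: Service
metadata:
  name: ${app}
spec:
  type: LoadBalancer
  selector:
    app: ${app}
  ports:
  - port: 8080
    targetPort: 8080
    protocol: TCP
EOF
}

# Example usage (placeholder image and secret names):
# render_manifest portablepredictionserver myregistry.azurecr.io/regressionmodel:latest my-pull-secret \
#   | kubectl apply -f -
```

Rendering to standard output first lets you inspect the manifest before piping it to `kubectl apply -f -`.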



## Score predictions with Postman {: #score-predictions-with-postman }

To test the model, download the [DataRobot PPS Examples](https://github.com/datarobot-community/ai_engineering/blob/master/mlops/DRMLOps_PortablePredictionServer_examples/DR%20MLOps%20Portable%20Prediction%20Server%20Public.postman_collection.json){ target=_blank }, a [Postman Collection](https://www.postman.com/collection/){ target=_blank }, and update the hostname from `localhost` to the external IP address assigned to your service. You can find the IP address in the **Services** tab on your Kubernetes dashboard:

![](images/int_damdmia_large.png)

To make a prediction, run the make predictions request from the collection:

![](images/int_damdmia_large_009.png)
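
The same request can be sketched with `curl` instead of Postman. This is a hedged example, not taken from the collection: it assumes the PPS accepts a CSV body via `POST` on a `/predictions` route on port `8080`, and the helper names `predictions_url` and `score_csv` and the file name `scoring_data.csv` are hypothetical. Check the request in the Postman Collection for the exact route and headers.

``` bash
# Sketch: score a CSV file against the PPS with curl.
# The /predictions route and text/csv content type are assumptions; verify them
# against the Postman Collection before relying on this.
predictions_url() {
  local host="$1" port="${2:-8080}"
  printf 'http://%s:%s/predictions' "$host" "$port"
}

score_csv() {
  local host="$1" csv_file="$2"
  curl --silent --show-error --fail \
    --request POST \
    --header "Content-Type: text/csv" \
    --data-binary "@${csv_file}" \
    "$(predictions_url "$host")"
}

# Example usage: look up the service's external IP, then send a scoring file.
# EXTERNAL_IP="$(kubectl get service portablepredictionserver \
#   -o jsonpath='{.status.loadBalancer.ingress[0].ip}')"
# score_csv "$EXTERNAL_IP" scoring_data.csv
```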

## Configure autoscaling and high availability {: #configure-autoscaling-and-high-availability }

Kubernetes supports horizontal pod autoscaling to adjust the number of pods in a deployment depending on CPU utilization or other selected metrics. The Metrics Server provides resource utilization to Kubernetes and is automatically deployed in AKS clusters.

In the previous sections, you deployed one pod for your service and defined only the minimum requirement for CPU and memory resources.

To use the autoscaler, you must define CPU requests and limits for the container.

By default, the Portable Prediction Server spins up one worker, which means it can handle only one HTTP request at a time. The number of workers you can run, and thus the number of HTTP requests it can handle concurrently, is tied to the number of CPU cores available to the container.

Because you set the minimum CPU requirement to `1`, you can now set the limit to `2` in the `patchSpec.yaml` file:

``` yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: portablepredictionserver
spec:
  selector:
    matchLabels:
      app: portablepredictionserver
  replicas: 1
  template:
    metadata:
      labels:
        app: portablepredictionserver
    spec:
      containers:
      - name: portablepredictionserver
        image: aisuccesseng.azurecr.io/regressionmodel:latest
        ports:
        - containerPort: 8080
        resources:
          requests:
            cpu: 1000m
          limits:
            cpu: 2000m
      imagePullSecrets:
      - name: aisuccesseng
```

Then run the following command:

``` bash
kubectl patch deployment portablepredictionserver --patch "$(cat patchSpec.yaml)"
```

Alternatively, you can update the deployment directly in the Kubernetes dashboard by editing the JSON as shown below and clicking **UPDATE**.

![](images/int_damdmia_large_008.png)

Now that the CPU limits are defined, you can configure autoscaling with the following command:

``` bash
kubectl autoscale deployment portablepredictionserver --cpu-percent=50 --min=1 --max=10
```

This enables Kubernetes to autoscale the number of pods in the `portablepredictionserver` deployment. If the average CPU utilization across all pods exceeds 50% of the CPU they request, the autoscaler adds pods, keeping the deployment between a minimum of one and a maximum of ten pods.
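
To get a feel for how the autoscaler reacts, the following sketch works through the scaling rule described in the Kubernetes documentation, `desiredReplicas = ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped to the configured minimum and maximum. It is a simplified illustration: the real controller also applies a tolerance and stabilization behavior that the sketch ignores, and `desired_replicas` is a hypothetical helper name.

``` bash
# Sketch: approximate the replica count the horizontal pod autoscaler would target.
# Usage: desired_replicas CURRENT_REPLICAS CURRENT_CPU_PERCENT TARGET_CPU_PERCENT [MIN] [MAX]
desired_replicas() {
  local current="$1" usage="$2" target="$3" min="${4:-1}" max="${5:-10}"
  # Integer ceiling of (current * usage / target).
  local desired=$(( (current * usage + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired="$min"; fi
  if [ "$desired" -gt "$max" ]; then desired="$max"; fi
  printf '%s\n' "$desired"
}

desired_replicas 1 90 50    # one pod at 90% of its request, 50% target: prints 2
desired_replicas 8 100 50   # would need 16 pods, capped at the maximum: prints 10

# To watch the real autoscaler while a load test runs:
# kubectl get hpa portablepredictionserver --watch
```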

To run a load test, download the sample JMeter test plan, update its URLs and authentication settings, and run it with the following command:

``` bash
jmeter -n -t LoadTesting.jmx -l results.csv
```

The output will look similar to the following example:

![](images/int_damdmia_large_017.png)

## Report usage to DataRobot MLOps via monitoring agents {: #report-usage-to-datrobot-mlops-via-monitoring-agents }

After deploying your model to AKS, you can monitor it, along with all of your other models, in one central dashboard by reporting telemetry for its predictions to your DataRobot MLOps server and dashboards.

1. Navigate to **Model Registry > Model Packages**, click **Add New Package**, and follow the instructions in the [documentation](reg-create#register-external-model-packages).

    ![](images/int_damdmogcp_large_011.png)

2. Select **Add new external model package** and specify a package name and description (1 and 2), upload the corresponding training data for drift tracking (3), and identify the model location (4), target (5), environment (6), and prediction type (7), then click **Create package** (8).

    ![](images/int_damdmia_large_012.png)

3. After creating the external model package, note the model ID in the URL as shown below (blurred in the image for security purposes).

    ![](images/int_htmsmwdm_large_017.png)

4. While still on the **Model Registry** page and within the expanded new package, select the **Deployments** tab and click **Create new deployment**.

    ![](images/int_damdmogcp_add1.png)

    The deployment page loads prefilled with information from the model package you created.

5. Complete any missing information for the deployment and click **Create deployment**.

6. Navigate to **Deployments > Overview** and copy the deployment ID (from the URL).
  
    ![](images/int_htmsmwdm_large_018.png)

Now that you have your model ID and deployment ID, you can report predictions as described in the next section.

### Report prediction details {: #report-prediction-details }

To report prediction details to DataRobot, you need to provide a few environment variables to your Portable Prediction Server container.

Update the deployment directly in the Kubernetes dashboard by editing the JSON and then clicking **UPDATE**:


``` json
"env": [
  {
    "name": "PORTABLE_PREDICTION_API_WORKERS_NUMBER",
    "value": "2"
  },
  {
    "name": "PORTABLE_PREDICTION_API_MONITORING_ACTIVE",
    "value": "True"
  },
  {
    "name": "PORTABLE_PREDICTION_API_MONITORING_SETTINGS",
    "value": "output_type=output_dir;path=/tmp;max_files=50;file_max_size=10240000;model_id=<modelId>;deployment_id=<deployment_id>"
  },
  {
    "name": "MONITORING_AGENT",
    "value": "False"
  },
  {
    "name": "MONITORING_AGENT_DATAROBOT_APP_URL",
    "value": "https://app.datarobot.com/"
  },
  {
    "name": "MONITORING_AGENT_DATAROBOT_APP_TOKEN",
    "value": "<YOUR TOKEN>"
  }
]
```

![](images/int_damdmia_large_010.png)
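
As an alternative to editing the JSON in the dashboard, the same variables can likely be applied with `kubectl set env`. The sketch below reuses the variable names and values from the JSON above; the helper names `monitoring_settings` and `set_monitoring_env` are hypothetical, and the model ID, deployment ID, and API token are placeholders you supply.

``` bash
# Sketch: apply the monitoring variables with kubectl instead of the dashboard.
# monitoring_settings and set_monitoring_env are hypothetical helper names.

# Build the semicolon-separated monitoring settings string.
monitoring_settings() {
  local model_id="$1" deployment_id="$2"
  printf 'output_type=output_dir;path=/tmp;max_files=50;file_max_size=10240000;model_id=%s;deployment_id=%s' \
    "$model_id" "$deployment_id"
}

# Set all six variables on the deployment; Kubernetes then rolls out new pods.
set_monitoring_env() {
  local model_id="$1" deployment_id="$2" api_token="$3"
  kubectl set env deployment/portablepredictionserver \
    PORTABLE_PREDICTION_API_WORKERS_NUMBER=2 \
    PORTABLE_PREDICTION_API_MONITORING_ACTIVE=True \
    "PORTABLE_PREDICTION_API_MONITORING_SETTINGS=$(monitoring_settings "$model_id" "$deployment_id")" \
    MONITORING_AGENT=False \
    MONITORING_AGENT_DATAROBOT_APP_URL=https://app.datarobot.com/ \
    "MONITORING_AGENT_DATAROBOT_APP_TOKEN=${api_token}"
}

# Example usage (placeholder values):
# set_monitoring_env "<modelId>" "<deployment_id>" "<YOUR TOKEN>"
```

Passing the token on the command line is convenient for a demo; for anything longer lived, storing it in a Kubernetes Secret is probably the safer choice.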

Even though you deployed a model outside of DataRobot on a Kubernetes cluster (AKS), you can monitor it like any other model and track service health and data drift in one central dashboard (see below).

![](images/int_htmsmwdm_large_012.png)
